DIRECTED EVOLUTION OF PROTEIN IN MAMMALIAN CELLS

CROSS-REFERENCES TO RELATED APPLICATIONS 
The present application claims priority to USSN 60/291,871, filed May 18,
2001, herein incorporated by reference in its entirety.

STATEMENT AS TO RIGHTS TO INVENTIONS MADE UNDER 
FEDERALLY SPONSORED RESEARCH AND DEVELOPMENT 
Not applicable. 


FIELD OF THE INVENTION 
The present invention relates to directed protein evolution in mammalian cells 
and improved mutants of Discosoma sp. red fluorescent proteins. 

BACKGROUND OF THE INVENTION

Red fluorescent protein has been isolated from a Discosoma sp. and sequenced
(see, e.g., Matz et al., Nature Biotech. 17:969-973 (1999); Gross et al., Proc. Nat'l Acad. Sci.
USA 97:11990-11995 (2000)). A variant with humanized codons has also been engineered
(Clontech, "DsRED™"). The crystal structure of red fluorescent protein has been elucidated,
which demonstrated that red fluorescent protein is a tetrameric protein (Wall et al., Nat.
Struc. Biol. 7:1089 (2000); Yarbrough et al., Proc. Nat'l Acad. Sci. USA 16:462-467 (2000)).

Red fluorescent protein (RFP) and DsRED, as well as other fluorescent
proteins such as YFP, or GFP from Aequorea victoria, Renilla reniformis, Renilla muelleri,
and Ptilosarcus gurneyi, are useful as reporter molecules for a variety of bioassays,
including those that use FACS as a selection mechanism (see, e.g., Tsien, Nature
Biotechnology 17:956 (1999); Tsien, Ann. Rev. Biochem. 67:509-544 (1998); Heim et al.,
Nature 373:663-664 (1995); Heim et al., Proc. Nat'l Acad. Sci. USA 91:1250 (1994); Prasher
et al., Gene 111:229 (1992); Prasher et al., Trends in Genetics 11:320 (1995); Chalfie et al.,
Science 263:802 (1994); and WO 95/21191). However, brighter, faster folding, and higher
expressing variants would be useful.

Such variants can be made, e.g., using methods of gene shuffling and
mutagenesis (see, e.g., US Patent 5,811,238; WO 00/73433; WO 00/22115; WO 99/41369;
WO 01/04287; WO 00/46344; WO 99/45143; WO 99/41368; and Ichiro et al., Protein
Science 8:731-740 (1999)). However, the use of such methods for production of variant
proteins such as Discosoma red fluorescent protein variants is not always successful (see,
e.g., Baird et al., Proc. Nat'l Acad. Sci. USA 97:11984-11989 (2000)). Novel methods of
making such variants would therefore be useful.

SUMMARY OF THE INVENTION 
The present invention therefore provides variants of Discosoma red
fluorescent protein that have been generated using directed molecular evolution in
mammalian cells. The variants of the invention have greatly improved brightness,
expression, and/or folding kinetics as compared to wild type or a codon optimized variant.
The present invention also provides novel methods of directed protein evolution in
mammalian cells using retroviral gene transfer and FACS sorting. Such methods can be used
to provide improved variants of fluorescent proteins such as Discosoma red fluorescent
protein and fluorescent proteins from other sources, such as Aequorea victoria, Renilla
reniformis, Renilla muelleri, and Ptilosarcus gurneyi.

In one aspect, the present invention provides an isolated Discosoma red 
fluorescent protein, the protein comprising an amino acid sequence as shown in Figure 1 with 
one or more point mutations at an amino acid position selected from the group consisting of 
N24, F125, K164, and M183.

In one embodiment, the protein comprises two, three, or four point mutations 
at an amino acid position selected from the group consisting of N24, F125, K164, and M183. 

In one embodiment, the point mutation at amino acid position N24 is a serine,
arginine, or histidine substitution. In another embodiment, the point mutation at amino acid
position F125 is a leucine or valine substitution. In another embodiment, the point mutation
at amino acid position K164 is a methionine substitution. In another embodiment, the point
mutation at amino acid position M183 is a lysine or threonine substitution.

In one embodiment, the protein comprises an amino acid sequence as shown
in Figure 1 with a leucine or valine substitution at amino acid position F125 and a lysine
substitution at amino acid position M183. In another embodiment, the protein comprises an
amino acid sequence as shown in Figure 1 with a leucine substitution at amino acid position
F125 and a lysine substitution at amino acid position M183. In another embodiment, the
protein comprises an amino acid sequence as shown in Figure 1 with a valine substitution at
amino acid position F125 and a lysine substitution at amino acid position M183.
In one embodiment, the protein comprises an amino acid sequence as shown
in Figure 1 with a leucine or valine substitution at amino acid position F125 and a serine,
arginine, or histidine substitution at amino acid position N24. In another embodiment, the
protein comprises an amino acid sequence as shown in Figure 1 with a leucine substitution at
amino acid position F125 and a serine substitution at amino acid position N24.

In one embodiment, the protein comprises an amino acid sequence as shown
in Figure 1 with a leucine or valine substitution at amino acid position F125, a serine,
arginine, or histidine substitution at amino acid position N24, and a lysine substitution at
amino acid position M183. In another embodiment, the protein comprises an amino acid
sequence as shown in Figure 1 with a leucine substitution at amino acid position F125, a
serine substitution at amino acid position N24, and a lysine substitution at amino acid
position M183.

In one embodiment, the protein comprises an amino acid sequence as shown
in Figure 1 with a methionine substitution at amino acid position K164.

In one embodiment, the protein comprises an amino acid sequence as shown
in Figure 1 with a leucine substitution at amino acid position F125.

In one embodiment, the protein further comprises one or more point mutations
at an amino acid position selected from the group consisting of K93, R19, K139, E150, and
D171. In another embodiment, the point mutation at amino acid position K93 is an arginine
substitution. In another embodiment, the point mutation at amino acid position R19 is a
histidine substitution. In another embodiment, the point mutation at amino acid position
E150 is an aspartic acid substitution. In another embodiment, the point mutation at amino
acid position D171 is a glycine substitution.

In one embodiment, the protein comprises an amino acid sequence as shown
in Figure 1 with a leucine substitution at amino acid position F125, a serine substitution at
amino acid position N24, a lysine substitution at amino acid position M183, and a histidine
substitution at amino acid position R19.

In one embodiment, the protein comprises an amino acid sequence as shown
in Figure 1 with a leucine substitution at amino acid position F125, an aspartic acid
substitution at amino acid position E150, and a glycine substitution at amino acid position
D171.

In one aspect, the present invention provides a Discosoma red fluorescent 
protein that is a fusion protein. 
In another aspect, the present invention provides a nucleic acid encoding the
Discosoma red fluorescent protein of the invention. In one embodiment, the nucleic acid is
codon-optimized for mammalian expression. In another embodiment, the nucleic acid
encodes a fusion protein.

In another aspect, the present invention provides a vector comprising a nucleic
acid encoding the Discosoma red fluorescent protein of the invention. In one embodiment,
the vector is a retroviral vector.

In another aspect, the present invention provides a host cell comprising the 
vector of the invention. 

In another aspect, the present invention provides a retroviral cDNA expression
library comprising a nucleic acid encoding the Discosoma red fluorescent protein.

In another aspect, the present invention provides a method of making a protein
variant, the method comprising the steps of: (i) mutating a selected nucleotide sequence
encoding a fluorescent protein; (ii) cloning the mutated sequences into an expression vector;
(iii) transfecting mammalian cells with the expression vector; and (iv) identifying the
variants.

In one embodiment, the protein is a fluorescent protein and variants are
identified by FACS analysis. In another embodiment, the selected nucleotide sequence
encodes a fluorescent protein from Discosoma "red" sp., Aequorea victoria, Renilla
reniformis, Renilla muelleri, or Ptilosarcus gurneyi.

In one embodiment, the selected nucleotide sequence is mutated using error
prone PCR.

In another embodiment, the expression vector is a retroviral expression vector.

In another aspect, the present invention provides a method of making a
fluorescent protein variant, the method comprising the steps of: (i) mutating by error prone
PCR a selected nucleotide sequence encoding a fluorescent protein; (ii) cloning the mutated
sequences into a retroviral expression vector; (iii) transfecting mammalian cells with the
expression vector; and (iv) selecting variants using FACS analysis.

In one embodiment, the selected nucleotide sequence encodes a fluorescent
protein from Discosoma "red" sp., Aequorea victoria, Renilla reniformis, Renilla muelleri,
or Ptilosarcus gurneyi.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 provides the amino acid and nucleotide sequence of a mammalian
codon optimized Discosoma red fluorescent protein. This figure also indicates preferred
point mutations in the amino acid sequence for variants.

Figure 2 provides examples of brighter Discosoma red fluorescent protein
variants.

Figure 3 provides a list of mutated Discosoma red fluorescent proteins isolated 
using the mammalian directed evolution methods of the invention. 

Figure 4 shows excitation and emission spectra of certain mutants of the
invention.

Figure 5 shows a diagram of methods for directed evolution of proteins in 
mammalian cells. 



DETAILED DESCRIPTION OF THE INVENTION 
The present invention provides variants of Discosoma red fluorescent protein,
which have enhanced brightness, expression, and/or folding kinetics. These improved
characteristics are useful for functional screens as a reporter for gene transcription (e.g., as a
fusion protein), for target characterization and localization of fusion proteins, and for
scaffolds for protein and peptide libraries. For example, variants of the invention can be
cloned into expression vectors that are used to express cDNA or random peptide libraries.
The variant is positioned in the vector such that it forms a fusion protein with the expressed
cDNA or peptide. The cDNA library can comprise sense, antisense, full length, and
truncated cDNAs. The peptide library is encoded by nucleic acids. cDNA libraries are made
from any suitable RNA source. Libraries encoding random peptides are made according to
techniques well known to those of skill in the art (see, e.g., U.S. Patent Nos. 6,153,380,
6,114,111, and 6,180,343). Any suitable vector can be used for the cDNA and peptide
libraries, including, e.g., retroviral vectors. The Discosoma variant can thus be used as a
selectable marker.

Red fluorescent protein is generally useful for screens employing FACS
assays. Red fluorescent protein is also useful in screens for reporter gene transcription,
fusion protein localization, yeast two hybrid experiments, immunoprecipitation and
proteomics, increased affinity of receptors for fluorescently labeled ligands, proteins which
increase the expression level of a second protein, altered immunogenicity for fluorescently
labeled antibodies, changes in cell shape and size, changes in proton pump activity, relative
DNA content in cell cycle and apoptosis, cellular localization and changes in metabolic rates
of calcium flux, cell division, mitochondrial activity, pH, and free radical production. Such
assays are useful for identifying proteins involved in the cell cycle, cellular proliferation,
lymphocyte activation, ubiquitination pathways, cancer, mast cell degranulation, viral
replication and translation (e.g., HCV) and angiogenesis. In addition to red fluorescent
protein, such screens can also use one or more additional fluorescent proteins, such as
Aequorea victoria GFP, Zoanthus YFP and GFP, Anemonia CFP, Clavularia CFP, D.
striata CFP, Renilla muelleri GFP, Renilla reniformis GFP, and Ptilosarcus gurneyi GFP,
and variants thereof.

In the present invention, novel methods of directed protein evolution were
used to obtain improved variants of red fluorescent protein, as well as other proteins,
including other fluorescent proteins as described above. In the methods of the invention,
error prone PCR is used to randomly mutagenize a nucleic acid sequence encoding a protein
of interest (see, e.g., Leung et al., Technique 1:11-15 (1989); Caldwell & Joyce, PCR
Methods and Applications 2:28-33 (1992); and Gramm et al., Proc. Nat'l Acad. Sci. USA
89:3576-3580 (1992)). The inherently low fidelity of Taq polymerase or other thermostable
polymerases can be further decreased by the addition of Mn2+, increasing the Mg2+
concentration, and using unequal dNTP concentrations. A preferred method of EP-PCR is
described in Caldwell & Joyce, PCR Methods and Applications 2:28-33 (1992) and in Current
Protocols, supra. Alternatively, other well known mutagenesis methods such as gene
shuffling could be employed (see, e.g., US Patent 5,811,238, WO 99/41369, WO 99/41368,
and WO 00/46344). The library of variant nucleic acids is then transferred to mammalian
cells (e.g., Jurkat, A549, Phoenix A, or BJAB) using retroviral vectors. Variants are detected
by any suitable assay, e.g., in the case of a fluorescent protein, by FACS. Clones of interest are
then rescued and isolated. As described in Figure 1, this technique was used to identify four
preferred sites of point mutations (amino acid substitutions) that lead to red fluorescent
proteins with enhanced brightness, altered emission, higher expression, and/or enhanced
folding kinetics (3° or 4° structure).
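
As a rough in silico illustration of the random mutagenesis step (not the wet-lab protocol itself), the following Python sketch introduces substitutions into a nucleotide sequence at a fixed per-base error rate and builds a small library of variants. The function names `error_prone_copy` and `make_library`, the error rate, and the toy template are all assumptions chosen for illustration; real EP-PCR substitution spectra are biased and are not modeled here.

```python
import random

BASES = "ACGT"


def error_prone_copy(template, error_rate, rng):
    """Return a copy of `template` in which each base is replaced by a
    different, randomly chosen base with probability `error_rate`."""
    copy = []
    for base in template:
        if rng.random() < error_rate:
            # Pick one of the three other bases uniformly (an assumption;
            # real polymerases show biased substitution spectra).
            copy.append(rng.choice([b for b in BASES if b != base]))
        else:
            copy.append(base)
    return "".join(copy)


def make_library(template, error_rate, size, seed=0):
    """Build a toy library of `size` mutagenized copies of `template`."""
    rng = random.Random(seed)
    return [error_prone_copy(template, error_rate, rng) for _ in range(size)]


if __name__ == "__main__":
    toy_template = "ATGGCCTCCTCCGAGGACGTC"  # hypothetical 21-nt fragment
    for variant in make_library(toy_template, error_rate=0.05, size=5):
        print(variant)
```

In this simplified model, raising `error_rate` plays the role of the fidelity-lowering conditions described above; the subsequent retroviral transfer and FACS selection steps have no computational analogue in the sketch.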



Definitions

The term "point mutation" refers to a deletion, addition, or substitution at a 
designated amino acid position in an amino acid or nucleotide sequence. Preferably, the term
refers to an amino acid substitution. 
"Discosoma red fluorescent protein" refers to a wild-type protein isolated from
Discosoma species "red" (described and sequenced in Matz et al, Nature Biotechnology 
17:969-973 (1999)), as well as a mammalian codon-optimized variant shown in Figure 1. 

"Nucleic acid" refers to deoxyribonucleotides or ribonucleotides and polymers
thereof in single- or double-stranded form, or complements thereof. The term encompasses
nucleic acids containing known nucleotide analogs or modified backbone residues or
linkages, which are synthetic, naturally occurring, and non-naturally occurring, which have
similar binding properties as the reference nucleic acid, and which are metabolized in a
manner similar to the reference nucleotides. Examples of such analogs include, without
limitation, phosphorothioates, phosphoramidates, methyl phosphonates, chiral-methyl
phosphonates, 2-O-methyl ribonucleotides, and peptide-nucleic acids (PNAs). Nucleic acids also
include complementary nucleic acids.

Unless otherwise indicated, a particular nucleic acid sequence also implicitly
encompasses conservatively modified variants thereof (e.g., degenerate codon substitutions)
and complementary sequences, as well as the sequence explicitly indicated. Specifically,
degenerate codon substitutions may be achieved by generating sequences in which the third
position of one or more selected (or all) codons is substituted with mixed-base and/or
deoxyinosine residues (Batzer et al., Nucleic Acid Res. 19:5081 (1991); Ohtsuka et al., J.
Biol. Chem. 260:2605-2608 (1985); Rossolini et al., Mol. Cell. Probes 8:91-98 (1994)). The
term nucleic acid is used interchangeably with gene, cDNA, mRNA, oligonucleotide, and
polynucleotide.

A particular nucleic acid sequence also implicitly encompasses "splice
variants." Similarly, a particular protein encoded by a nucleic acid implicitly encompasses
any protein encoded by a splice variant of that nucleic acid. "Splice variants," as the name
suggests, are products of alternative splicing of a gene. After transcription, an initial nucleic
acid transcript may be spliced such that different (alternate) nucleic acid splice products
encode different polypeptides. Mechanisms for the production of splice variants vary, but
include alternate splicing of exons. Alternate polypeptides derived from the same nucleic
acid by read-through transcription are also encompassed by this definition. Any products of a
splicing reaction, including recombinant forms of the splice products, are included in this
definition.

The terms "polypeptide," "peptide" and "protein" are used interchangeably
herein to refer to a polymer of amino acid residues. The terms apply to amino acid polymers
in which one or more amino acid residue is an artificial chemical mimetic of a corresponding
naturally occurring amino acid, as well as to naturally occurring amino acid polymers and
non-naturally occurring amino acid polymers.

The term "amino acid" refers to naturally occurring and synthetic amino acids,
as well as amino acid analogs and amino acid mimetics that function in a manner similar to
the naturally occurring amino acids. Naturally occurring amino acids are those encoded by
the genetic code, as well as those amino acids that are later modified, e.g., hydroxyproline,
γ-carboxyglutamate, and O-phosphoserine. Amino acid analogs refers to compounds that have
the same basic chemical structure as a naturally occurring amino acid, i.e., an α carbon that is
bound to a hydrogen, a carboxyl group, an amino group, and an R group, e.g., homoserine,
norleucine, methionine sulfoxide, methionine methyl sulfonium. Such analogs have modified
R groups (e.g., norleucine) or modified peptide backbones, but retain the same basic chemical
structure as a naturally occurring amino acid. Amino acid mimetics refers to chemical
compounds that have a structure that is different from the general chemical structure of an
amino acid, but that functions in a manner similar to a naturally occurring amino acid.

Amino acids may be referred to herein by either their commonly known three
letter symbols or by the one-letter symbols recommended by the IUPAC-IUB Biochemical
Nomenclature Commission. Nucleotides, likewise, may be referred to by their commonly
accepted single-letter codes.

"Conservatively modified variants" applies to both amino acid and nucleic
acid sequences. With respect to particular nucleic acid sequences, conservatively modified
variants refers to those nucleic acids which encode identical or essentially identical amino
acid sequences, or where the nucleic acid does not encode an amino acid sequence, to
essentially identical sequences. Because of the degeneracy of the genetic code, a large
number of functionally identical nucleic acids encode any given protein. For instance, the
codons GCA, GCC, GCG and GCU all encode the amino acid alanine. Thus, at every
position where an alanine is specified by a codon, the codon can be altered to any of the
corresponding codons described without altering the encoded polypeptide. Such nucleic acid
variations are "silent variations," which are one species of conservatively modified
variations. Every nucleic acid sequence herein which encodes a polypeptide also describes
every possible silent variation of the nucleic acid. One of skill will recognize that each codon
in a nucleic acid (except AUG, which is ordinarily the only codon for methionine, and UGG,
which is ordinarily the only codon for tryptophan) can be modified to yield a functionally
identical molecule. Accordingly, each silent variation of a nucleic acid which encodes a
polypeptide is implicit in each described sequence with respect to the expression product, but
not with respect to actual probe sequences.
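
The degeneracy argument above can be checked with a short Python sketch. It builds the standard genetic code from the conventional UCAG codon ordering and lists the synonymous codons for a given amino acid. The helper name `synonymous_codons` is hypothetical, and RNA codons (U in place of T) are used to match the alanine example in the text.

```python
from itertools import product

# Standard genetic code laid out in the conventional UCAG order
# (first base varies slowest, third base fastest); '*' marks stop codons.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_codons(amino_acid):
    """Return all codons encoding the one-letter `amino_acid`, sorted."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


if __name__ == "__main__":
    for residue in ("A", "M", "W"):
        print(residue, synonymous_codons(residue))
```

Alanine has four interchangeable codons, so any of them can be swapped silently, whereas methionine and tryptophan each have a single codon and therefore admit no silent variation.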

As to amino acid sequences, one of skill will recognize that individual
substitutions, deletions or additions to a nucleic acid, peptide, polypeptide, or protein
sequence which alters, adds or deletes a single amino acid or a small percentage of amino
acids in the encoded sequence is a "conservatively modified variant" where the alteration
results in the substitution of an amino acid with a chemically similar amino acid.
Conservative substitution tables providing functionally similar amino acids are well known in
the art. Such conservatively modified variants are in addition to and do not exclude
polymorphic variants, interspecies homologs, and alleles of the invention.

The following eight groups each contain amino acids that are conservative
substitutions for one another: 1) Alanine (A), Glycine (G); 2) Aspartic acid (D), Glutamic
acid (E); 3) Asparagine (N), Glutamine (Q); 4) Arginine (R), Lysine (K); 5) Isoleucine (I),
Leucine (L), Methionine (M), Valine (V); 6) Phenylalanine (F), Tyrosine (Y), Tryptophan
(W); 7) Serine (S), Threonine (T); and 8) Cysteine (C), Methionine (M) (see, e.g., Creighton,
Proteins (1984)).
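
The eight groups above can be encoded directly as sets, which makes it easy to test whether a given substitution counts as conservative under this definition. The following Python sketch is illustrative only; the names `CONSERVATIVE_GROUPS` and `is_conservative` are assumptions and not part of the specification.

```python
# The eight conservative-substitution groups listed in the text,
# written with one-letter amino acid codes.
CONSERVATIVE_GROUPS = [
    set("AG"),    # 1) Alanine, Glycine
    set("DE"),    # 2) Aspartic acid, Glutamic acid
    set("NQ"),    # 3) Asparagine, Glutamine
    set("RK"),    # 4) Arginine, Lysine
    set("ILMV"),  # 5) Isoleucine, Leucine, Methionine, Valine
    set("FYW"),   # 6) Phenylalanine, Tyrosine, Tryptophan
    set("ST"),    # 7) Serine, Threonine
    set("CM"),    # 8) Cysteine, Methionine
]


def is_conservative(original, replacement):
    """True if both residues share at least one of the eight groups."""
    return any(
        original in group and replacement in group
        for group in CONSERVATIVE_GROUPS
    )


if __name__ == "__main__":
    for wild_type, new in [("K", "R"), ("E", "D"), ("F", "L"), ("K", "M")]:
        print(wild_type, "->", new, is_conservative(wild_type, new))
```

Under these groups, substitutions such as K to R or E to D are conservative, while F to L or K to M are not, because the two residues do not share a group.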

The term "recombinant" when used with reference, e.g., to a cell, or nucleic
acid, protein, or vector, indicates that the cell, nucleic acid, protein or vector, has been
modified by the introduction of a heterologous nucleic acid or protein or the alteration of a
native nucleic acid or protein, or that the cell is derived from a cell so modified. Thus, for
example, recombinant cells express genes that are not found within the native
(non-recombinant) form of the cell or express native genes that are otherwise abnormally
expressed, under expressed or not expressed at all.

The term "heterologous" when used with reference to portions of a nucleic
acid indicates that the nucleic acid comprises two or more subsequences that are not found in
the same relationship to each other in nature. For instance, the nucleic acid is typically
recombinantly produced, having two or more sequences from unrelated genes arranged to
make a new functional nucleic acid, e.g., a promoter from one source and a coding region
from another source. Similarly, a heterologous protein indicates that the protein comprises
two or more subsequences that are not found in the same relationship to each other in nature
(e.g., a fusion protein).

A "fluorescent" label may be detected by exciting the fluorochrome with the 
appropriate wavelength of light and detecting the resulting fluorescence. The fluorescence 
may be detected visually, or by the use of electronic detectors such as charge coupled devices 
(CCDs) or photomultipliers and the like. Similarly, enzymatic labels may be detected by
providing the appropriate substrates for the enzyme and detecting the resulting reaction 
product. FACS analysis is a preferred method of detection when the label is in a cell. 

EXAMPLES

The following example is provided by way of illustration only and not by way 
of limitation. Those of skill in the art will readily recognize a variety of noncritical 
parameters that could be changed or modified to yield essentially similar results. 

Example 1: Error prone PCR and directed evolution of Discosoma red species fluorescent
proteins in mammalian cells

To mutagenize Discosoma red fluorescent protein, a mammalian codon
optimized variant (see Figure 1) was cloned with a flag tag and mutagenized using error
prone PCR according to methods known to those of skill in the art (see, e.g., Current
Protocols in Molecular Biology, volume 1, unit 8.3 (Ausubel et al., eds, 1994); Saiki et al.,
Science 239:487 (1988); Leung et al., Technique 1:11-15 (1989); Caldwell & Joyce, PCR
Methods and Applications 2:28-33 (1992); and Gramm et al., Proc. Nat'l Acad. Sci. USA
89:3576-3580 (1992)).

The resulting library of mutagenized sequences was cloned into a retroviral
vector expression library using RT-PCR and the retroviral library was used to infect human
cells (BJAB cells). The cells were sorted for brighter fluorescence, higher expression, or
shifted emission. Selected clones were isolated using RT-PCR, and sub-libraries were
constructed and selected further with FACS (see Figure 5). Single cell clones were isolated
and sequenced.

Figure 2 lists some of the brighter mutants identified in the screen (note:
amino acid positions are off by one from the sequence numbering described in Figure 1 as
the methionine was counted as zero for the purposes of the numbering in Figure 2). Figure 1
lists certain preferred mutations at amino acid positions N24, F125, K164 and M183, e.g.,
N24S/R/H; F125L/V; K164M; and M183K. These mutations can exist in the mutated
variants alone or in any combination of one, two, three, or four, or optionally with additional
point mutations at K93, R19, K139, E150, and D171, e.g., K93R, R19H, E150D, and D171G.
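
The mutation labels and the numbering offset between the figures can be made concrete with a small Python sketch. It parses labels such as F125L, applies them to a sequence under either numbering convention (initiator methionine counted as 1 or as 0), and enumerates the combinations of preferred mutations named in this example. All function names, the `first_index` parameter, and the three-residue toy sequence are hypothetical; the Figure 1 sequence itself is not reproduced here.

```python
import re
from itertools import combinations, product

MUTATION_RE = re.compile(r"^([A-Z])(\d+)([A-Z])$")

# Preferred sites and substitutions named in this example.
PREFERRED = {"N24": "SRH", "F125": "LV", "K164": "M", "M183": "K"}


def parse_mutation(label):
    """Split a label such as 'F125L' into (wild type, position, replacement)."""
    match = MUTATION_RE.match(label)
    if match is None:
        raise ValueError(f"unrecognized mutation label: {label!r}")
    wild_type, position, replacement = match.groups()
    return wild_type, int(position), replacement


def apply_mutations(sequence, labels, first_index=1):
    """Apply substitutions to `sequence`. `first_index` is the number assigned
    to the first residue: 1 if the methionine is counted as one, 0 if it is
    counted as zero."""
    residues = list(sequence)
    for label in labels:
        wild_type, position, replacement = parse_mutation(label)
        offset = position - first_index
        if not 0 <= offset < len(residues):
            raise ValueError(f"{label}: position outside sequence")
        if residues[offset] != wild_type:
            raise ValueError(f"{label}: found {residues[offset]} at that position")
        residues[offset] = replacement
    return "".join(residues)


def enumerate_variants(sites=PREFERRED):
    """List every combination of one or more sites, one substitution per site."""
    variants = []
    for k in range(1, len(sites) + 1):
        for chosen in combinations(sorted(sites), k):
            for picks in product(*(sites[s] for s in chosen)):
                variants.append(tuple(s + r for s, r in zip(chosen, picks)))
    return variants


if __name__ == "__main__":
    print(apply_mutations("MKF", ["F3L"]))                 # methionine counted as 1
    print(apply_mutations("MKF", ["F2L"], first_index=0))  # methionine counted as 0
    print(len(enumerate_variants()))
```

With only the substitutions named in this example (three options at N24, two at F125, and one each at K164 and M183), the enumeration yields 47 distinct variants carrying at least one preferred mutation.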
All publications and patent applications cited in this specification are herein
incorporated by reference as if each individual publication or patent application were 
specifically and individually indicated to be incorporated by reference. 

Although the foregoing invention has been described in some detail by way of
illustration and example for purposes of clarity of understanding, it will be readily apparent to
one of ordinary skill in the art in light of the teachings of this invention that certain changes
and modifications may be made thereto without departing from the spirit or scope of the
appended claims.
1. An isolated Discosoma red fluorescent protein, the protein
comprising an amino acid sequence as shown in Figure 1 with one or more point
mutations at an amino acid position selected from the group consisting of N24, F125,
K164, and M183.

2. The protein of claim 1, wherein the protein comprises two, three, or
four point mutations at an amino acid position selected from the group consisting of N24,
F125, K164, and M183.

3. The protein of claim 1, wherein the point mutation at amino acid
position N24 is a serine, arginine, or histidine substitution.

4. The protein of claim 1, wherein the point mutation at amino acid
position F125 is a leucine or valine substitution.

5. The protein of claim 1, wherein the point mutation at amino acid
position K164 is a methionine substitution.

6. The protein of claim 1, wherein the point mutation at amino acid
position M183 is a lysine or threonine substitution.

7. The protein of claim 1, wherein the protein comprises an amino
acid sequence as shown in Figure 1 with a leucine or valine substitution at amino acid
position F125 and a lysine substitution at amino acid position M183.

8. The protein of claim 1, wherein the protein comprises an amino
acid sequence as shown in Figure 1 with a leucine substitution at amino acid position
F125 and a lysine substitution at amino acid position M183.

9. The protein of claim 1, wherein the protein comprises an amino
acid sequence as shown in Figure 1 with a valine substitution at amino acid position F125
and a lysine substitution at amino acid position M183.

10. The protein of claim 1, wherein the protein comprises an amino
acid sequence as shown in Figure 1 with a leucine or valine substitution at amino acid
position F125 and a serine, arginine, or histidine substitution at amino acid position N24.

11. The protein of claim 1, wherein the protein comprises an amino
acid sequence as shown in Figure 1 with a leucine substitution at amino acid position
F125 and a serine substitution at amino acid position N24.

12. The protein of claim 1, wherein the protein comprises an amino
acid sequence as shown in Figure 1 with a leucine or valine substitution at amino acid
position F125, a serine, arginine, or histidine substitution at amino acid position N24, and
a lysine substitution at amino acid position M183.

13. The protein of claim 1, wherein the protein comprises an amino
acid sequence as shown in Figure 1 with a leucine substitution at amino acid position
F125, a serine substitution at amino acid position N24, and a lysine substitution at amino
acid position M183.

14. The protein of claim 1, wherein the protein comprises an amino
acid sequence as shown in Figure 1 with a methionine substitution at amino acid position
K164.

15. The protein of claim 1, wherein the protein comprises an amino
acid sequence as shown in Figure 1 with a leucine substitution at amino acid position
F125.

16. The protein of claim 1, further comprising one or more point
mutations at an amino acid position selected from the group consisting of K93, R19,
K139, E150, and D171.

17. The protein of claim 16, wherein the point mutation at amino acid
position K93 is an arginine substitution.

18. The protein of claim 16, wherein the point mutation at amino acid
position R19 is a histidine substitution.

19. The protein of claim 16, wherein the point mutation at amino acid
position E150 is an aspartic acid substitution.

20. The protein of claim 16, wherein the point mutation at amino acid
position D171 is a glycine substitution.

21. The protein of claim 18, wherein the protein comprises an amino
acid sequence as shown in Figure 1 with a leucine substitution at amino acid position
F125, a serine substitution at amino acid position N24, a lysine substitution at amino acid
position M183, and a histidine substitution at amino acid position R19.

22. The protein of claim 19, wherein the protein comprises an amino
acid sequence as shown in Figure 1 with a leucine substitution at amino acid position
F125, an aspartic acid substitution at amino acid position E150, and a glycine substitution
at amino acid position D171.

23. A fusion protein comprising the protein of claim 1.

24. An isolated nucleic acid encoding the protein of claim 1.

25. The nucleic acid of claim 24, wherein the nucleic acid is
codon-optimized for mammalian expression.

26. The nucleic acid of claim 24, wherein the nucleic acid encodes a
fusion protein.

27. A vector comprising the nucleic acid of claim 24.

28. The vector of claim 27, wherein the vector is a retroviral vector.

29. A host cell comprising the vector of claim 27.

30. A retroviral cDNA expression library comprising the nucleic acid
of claim 24.

31. A retroviral cDNA expression library encoding the protein of claim
1.

32. A method of making a fluorescent protein variant, the method
comprising the steps of:
(i) mutating a selected nucleotide sequence encoding a fluorescent protein;
(ii) cloning the mutated sequences into an expression vector;
(iii) transfecting mammalian cells with the expression vector; and
(iv) selecting variants using FACS analysis.

33. The method of claim 32, wherein the selected nucleotide sequence
is mutated using error prone PCR.

34. The method of claim 32, wherein the expression vector is a
retroviral expression vector.

35. The method of claim 32, wherein the selected nucleotide sequence
encodes a fluorescent protein from Discosoma "red" sp., Aequorea victoria, Renilla
reniformis, Renilla muelleri, or Ptilosarcus gurneyi.

36. A method of making a fluorescent protein variant, the method
comprising the steps of:
(i) mutating by error prone PCR a selected nucleotide sequence encoding a
fluorescent protein;
(ii) cloning the mutated sequences into a retroviral expression vector;
(iii) transfecting mammalian cells with the expression vector; and
(iv) selecting variants using FACS analysis.

37. The method of claim 36, wherein the selected nucleotide sequence
encodes a fluorescent protein from Discosoma "red" sp., Aequorea victoria, Renilla
reniformis, Renilla muelleri, or Ptilosarcus gurneyi.

Figure 1: Location of point mutations which enhance fluorescence of DsRED in
mammalian cells
List of confirmed mutations from screen:
N24S, N24R, N24H (orange box) 
F125L, and F125V (blue box) 
K164M (green box) 
M183K (red box) 

Figure 2. Examples of Brighter DsRED mutants.

Individual mutant sequences (mutations shown in red) were infected into a
population of BJAB cells and analyzed by FACS 48 hours after infection. Wild type
DsREDcFlag is shown in the bottom left and exhibits no FL2 fluorescence above
background autofluorescence. Since the infection rate was relatively low, most cells
express the DsRED variant from a single integrated retrovirus. All histograms have
FL2 (normally measures PE on the FACS) on the x-axis. For reference, GFP is
measured in FL1.